Finish migration of `vars_funs` module for Python package #32

jeancochrane · 2024-11-26T22:54:43Z

This PR builds on #31 by defining a vars_recode function in the Python package, which completes the migration of the vars_funs module to Python.


dfsnow · 2024-12-02T16:54:43Z

python/ccao/vars_funs.py

+    human-readable format. For example, EXT_WALL = 2 will become
+    EXT_WALL = "Frame + Masonry". Note that the values and their translations


nitpick: It's wrong in the R docs too, but EXT_WALL = 2 is "Masonry", not "Frame + Masonry".

Oops, thanks! Fixed in d135b9a.
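
For readers following along, here is a minimal sketch of the lookup that the corrected docstring example describes. It is not the package's implementation: `toy_dict` is a hypothetical one-row stand-in for the real dictionary, and the sketch assumes `var_code` holds the code and `var_value` holds the long label.

```python
import pandas as pd

# Hypothetical one-row stand-in for the variable dictionary. Only the
# EXT_WALL = 2 -> "Masonry" pair comes from the review thread.
toy_dict = pd.DataFrame({"var_code": ["2"], "var_value": ["Masonry"]})

# Input data holding coded values as strings
data = pd.DataFrame({"EXT_WALL": ["2", "2"]})

# Build a code -> label mapping and apply it to the coded column
lookup = dict(zip(toy_dict["var_code"], toy_dict["var_value"]))
recoded = data["EXT_WALL"].map(lookup)
```

Here `recoded` holds the label "Masonry" for both rows.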

dfsnow · 2024-12-02T16:56:37Z

python/ccao/vars_funs.py

+def vars_recode(
+    data: pd.DataFrame,
+    cols: list[str] | None = None,
+    code_type: str = "long",
+    as_factor: bool = True,
+    dictionary: pd.DataFrame | None = None,
+) -> pd.DataFrame:


I'd really like to keep the interfaces across the R and Python versions of our major packages the same. Maybe we can rename the R function inputs here and then release a major version as we did with assesspy. While we're at it we can trim out some of the unused functionality of the R version.

dfsnow · 2024-12-02T17:11:16Z

python/ccao/vars_funs.py

+    if dictionary.empty:
+        raise ValueError("dictionary must be a non-empty pandas DataFrame")
+
+    required_columns = {"var_code", "var_value", "var_value_short"}


issue (blocking): var_name, var_type, and var_data_type should be required as well since they're used below.

Edit: Plus all the var_name_ columns.

That makes sense, I added stricter validation in f5ee577. Note that we check for any column matching the pattern var_name_*, since technically the function doesn't care what the names are, or how many of them there are.
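
A rough sketch of what this kind of validation might look like, assuming a fixed set of required columns plus at least one `var_name_*` column. The function name `validate_dictionary` and the exact required set are assumptions for illustration; the actual check lives in f5ee577 and may differ.

```python
import pandas as pd

# Assumed fixed columns, based on the names mentioned in this thread
REQUIRED_COLUMNS = {
    "var_code",
    "var_value",
    "var_value_short",
    "var_type",
    "var_data_type",
}


def validate_dictionary(dictionary: pd.DataFrame) -> None:
    """Hypothetical helper: raise ValueError if the schema looks wrong."""
    if dictionary.empty:
        raise ValueError("dictionary must be a non-empty pandas DataFrame")

    # Every fixed column must be present
    missing = REQUIRED_COLUMNS - set(dictionary.columns)
    if missing:
        raise ValueError(f"dictionary is missing columns: {sorted(missing)}")

    # Any number of var_name_* columns is fine, as long as one exists
    if not any(str(col).startswith("var_name_") for col in dictionary.columns):
        raise ValueError("dictionary must have at least one var_name_* column")
```

Checking for the `var_name_` prefix rather than a fixed list keeps the function agnostic to which naming schemes a custom dictionary provides.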

dfsnow · 2024-12-02T17:18:49Z

python/docs/source/reference.rst

+Dictionaries
+^^^^^^^^^^^^
+
+Lookups for numeric codes used in the assessment system


nitpick: Also used for vars_rename, so not just for the numeric code lookup.

Good point, clarified in 2a82c96.

dfsnow · 2024-12-02T17:46:43Z

python/ccao/vars_funs.py

@@ -7,7 +7,7 @@

 # Load the default variable dictionary
 _data_path = importlib.resources.files(ccao.data)
-vars_dict = pd.read_csv(str(_data_path / "vars_dict.csv"))
+vars_dict = pd.read_csv(str(_data_path / "vars_dict.csv"), dtype=str)


issue (non-blocking): Won't this prevent matching to numeric values in the original data? i.e. 1 instead of "1".
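
The concern can be reproduced with a small sketch. The labels below are placeholders rather than real dictionary values, and casting the input to strings is only one possible fix, not necessarily what the package does.

```python
import pandas as pd

# Lookup keyed by strings, as it would be after reading with dtype=str.
# "A" and "B" are placeholder labels.
lookup = {"1": "A", "3": "B"}

codes_as_int = pd.Series([1, 3])

# Integer 1 is not equal to the string "1", so nothing matches
unmatched = codes_as_int.map(lookup)

# Casting the input to str first lets the codes match the string keys
matched = codes_as_int.astype(str).map(lookup)
```

`unmatched` is all NaN, while `matched` contains the placeholder labels. Note that a float column would stringify as "1.0" rather than "1", so a plain cast would not be enough in that case.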

dfsnow · 2024-12-02T17:48:37Z

python/tests/test_vars_funs.py

+            {
+                "input": {
+                    "athena": {
+                        "name": "char_bsmt",
+                        "value": ["1", "3", "4", "5"],
+                    },
+                    "iasworld": {
+                        "name": "bsmt",
+                        "value": ["1", "3", "4", "5"],
+                    },
+                },


issue (blocking): We should add a test for numeric values (and update the function to handle those) or make it clear to the user that vars_recode expects all values to be strings.

Good thinking, although I'm going to wait to update anything until we decide on what the expected behavior should be in this thread.

jeancochrane · 2024-12-03T20:09:25Z

@dfsnow This should be ready for another look!

dfsnow

Alright, this looks set to go now!

codecov · 2024-12-04T22:13:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (969dae7) to head (c771664).
Report is 6 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff            @@
##            master       #32   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          382       382           
=========================================
  Hits           382       382

jeancochrane and others added 30 commits November 20, 2024 13:39

Add basic Python project with just vars_rename

0231af4

Add unit tests for vars_rename

a738b8b

Add pytest-coverage workflow

adb87a6

Add Development docs to python/README.md

c0739e8

Clean up docs in vars_funs.py

6d63b51

Fix typo in pytest-coverage workflow

f3f38e1

Accept any python >=3.9 in python package

abdeca0

Use optional-dependencies for dev deps in pyproject.toml

ec0a63e

Fix vars_rename docstring in Python package

08e418a

Update typing in vars_funs.py to be compatible with Python 3.9

001b079

Add Sphinx docs for Python package

2aae031

Update actions/checkout versions across workflows

4513bcb

Add Python docs generation to docs workflow

44fd662

Fix ruff linter errors

4070f6f

Install both test and docs requirements when running pytest

854aefc

Fix paths in pytest-coverage workflow

8afc1d4

Better path management in docs conf.py

2549daf

Rename build jobs in docs workflow

723834e

Include csv files in package data when building Python package

c323370

Temporarily disable branch restriction for docs deployment to test it out

9cc256b

Update deploy-pages version

eb9a619

Revert "Temporarily disable branch restriction for docs deployment to test it out"

bbbaf68

This reverts commit 9cc256b.

Fix broken link in Python docs

b0055b4

Switch to new style python type hints since we don't support 3.9 anyway

be12f3e

Remove unnecessary templates_path config from pyproject.toml

bd6835d

Empty commit to try to bust build-pkgdown-site actions cache

5eb1e23

Draft Python version of vars_recode

1f7290e

Remove unnecessary .python-version file

1b6bbef

Add pip install directions to README and index.rst for Python package

bca6864

Remove unnecessary uv.lock file

5960235

jeancochrane marked this pull request as ready for review November 26, 2024 23:22

jeancochrane requested a review from a team as a code owner November 26, 2024 23:22

jeancochrane requested a review from dfsnow November 26, 2024 23:23

jeancochrane added 3 commits November 27, 2024 18:29

Add python/ subdir to RBuildignore so it does not get built into R package

a8c3233

Support Python 3.9, pandas 1.4, and numpy 1.23

7505970

Try installing pandas/numpy before the other dependencies in pytest-coverage workflow

df3af62

dfsnow reviewed Dec 2, 2024

jeancochrane and others added 11 commits December 2, 2024 18:33

Try building and testing Python package with tox

e34b6d7

Add UV_CACHE_DIR to tox env to see if it speeds up builds

5637158

Revert "Add UV_CACHE_DIR to tox env to see if it speeds up builds"

0e29e6f

This reverts commit 5637158.

Restrict tox envs since 3.11 seems to need to build a dep from source

b410f6e

Update docs to fix incorrect EXT_WALL code translation

d135b9a

Clarify docs for vars_dict data object in reference.rst

2a82c96

Stricter dictionary schema validation in Python version of vars_recode

f5ee577

Remove outdated comment in python/ccao/vars_funs.py

67ea0bb

Co-authored-by: Dan Snow <[email protected]>

Fix wheel caching on CI when using uv in Python package

b9f300c

Speed up Python install with uv in docs.yaml

815b6a8

Pass env vars to tox defensively

0890d98

jeancochrane mentioned this pull request Dec 3, 2024

Update vars_funs interface in the R package to match the Python package #34

Merged

jeancochrane requested a review from dfsnow December 3, 2024 20:09

dfsnow approved these changes Dec 4, 2024

jeancochrane and others added 5 commits December 4, 2024 15:32

Merge pull request #33 from ccao-data/jeancochrane/make-python-package-spark-compatible

ca15900

Make the Python package compatible with Athena PySpark

Remove UV_SYSTEM_PYTHON env var from docs workflow

dcf038f

Add shell: bash config to Build Python docs step of docs workflow

c2ab1ff

Add tmate to docs workflow for debugging

dd2d922

Run sphinx-build from the correct working directory in docs workflow

c771664

jeancochrane merged commit 2b027d4 into master Dec 4, 2024
17 checks passed

jeancochrane deleted the jeancochrane/further-python-package-migration branch December 4, 2024 22:16
